Many data are naturally modeled by an unobserved hierarchical structure. In this paper we propose a flexible nonparametric prior over unknown data hierarchies. The approach uses nested stick-breaking processes to allow for trees of unbounded width and depth, where data can live at any node and are infinitely exchangeable. One can view our model as providing infinite mixtures where the components have a dependency structure corresponding to an evolutionary diffusion down a tree. By using a stick-breaking approach, we can apply Markov chain Monte Carlo methods based on slice sampling to perform Bayesian inference and simulate from the posterior distribution on trees. We apply our method to hierarchical clustering of images and topic modeling of text data.
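
To illustrate the building block named above, the following is a minimal sketch of a single-level, truncated stick-breaking draw. It is not the nested tree construction proposed in the paper. The function name `stick_breaking_weights`, the concentration parameter `alpha`, and the truncation level `num_sticks` are assumptions introduced only for this example.

```python
import random


def stick_breaking_weights(alpha, num_sticks, rng):
    """Draw the first `num_sticks` weights of a stick-breaking process.

    Illustrative only: `alpha` and `num_sticks` are hypothetical parameters,
    and truncating the process is a simplification for this example.
    """
    weights = []
    remaining = 1.0  # length of the unit stick not yet assigned
    for _ in range(num_sticks):
        # Break off a Beta(1, alpha) fraction of what is left.
        fraction = rng.betavariate(1.0, alpha)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    return weights, remaining


rng = random.Random(0)
weights, leftover = stick_breaking_weights(alpha=2.0, num_sticks=10, rng=rng)
```

Each weight is a fraction of the stick left over by the previous breaks, so the weights are nonnegative and, together with `leftover`, sum to one. Larger values of `alpha` tend to spread the mass over more components.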